CONCORD: a consensus method for protein secondary structure prediction via mixed integer linear optimization
Abstract
Most protein structure prediction methods use a multi-step process that often includes secondary structure prediction, contact prediction, fragment generation, clustering, etc. For many years, secondary structure prediction has been the workhorse for numerous methods aimed at predicting protein structure and function. This paper presents a new mixed integer linear optimization (MILP)-based consensus method: a Consensus scheme based On a mixed integer liNear optimization method for seCOndary stRucture preDiction (CONCORD). Building on seven secondary structure prediction methods (SSpro, DSC, PROF, PROFphd, PSIPRED, Predator and GorIV), the MILP-based consensus method combines the strengths of the individual methods, maximizes the number of correctly predicted amino acids and achieves a better prediction accuracy. The method is shown to perform well compared with the seven individual methods when tested on the PDBselect25 training protein set using sixfold cross validation. It also performs well compared with another set of 10 online secondary structure prediction servers (including several recent ones) when tested on the CASP9 targets (http://predictioncenter.org/casp9/). The average Q3 prediction accuracy is 83.04 per cent for the sixfold cross validation of the PDBselect25 set and 82.3 per cent for the CASP9 targets. We have developed a MILP-based consensus method for protein secondary structure prediction. A web server, CONCORD, is available to the scientific community at http://helios.princeton.edu/CONCORD.
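The abstract does not give the MILP formulation, so the following Python sketch only illustrates the general idea under stated assumptions: Q3 is computed as the percentage of residues whose three-state label (H, E or C) matches the observed one, and a consensus is formed by a weighted per-residue vote. The function names, the tie-breaking rule, the small integer weight range and the exhaustive search are all hypothetical stand-ins; CONCORD itself solves a mixed integer linear program, which this toy search does not reproduce.

```python
from itertools import product

STATES = "HEC"  # three-state alphabet: helix, strand, coil


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def consensus(predictions, weights):
    """Weighted per-residue vote over several predictions.

    Assumption: ties are resolved in STATES order (H, then E, then C).
    Labels outside STATES raise KeyError.
    """
    result = []
    for column in zip(*predictions):
        score = {s: 0 for s in STATES}
        for label, weight in zip(column, weights):
            score[label] += weight
        result.append(max(STATES, key=lambda s: score[s]))
    return "".join(result)


def fit_weights(predictions, observed, max_weight=2):
    """Toy stand-in for the optimization step (not a MILP solver).

    Exhaustively tries integer weights in 0..max_weight and keeps the
    first combination that maximizes the number of correct residues.
    """
    best_weights, best_correct = None, -1
    for weights in product(range(max_weight + 1), repeat=len(predictions)):
        guess = consensus(predictions, weights)
        correct = sum(g == o for g, o in zip(guess, observed))
        if correct > best_correct:
            best_weights, best_correct = list(weights), correct
    return best_weights, best_correct


# Made-up example data: three hypothetical predictors on a 6-residue chain.
observed = "HHHECC"
predictions = ["HHHCCC", "HHEECC", "CHHEEC"]

majority = consensus(predictions, [1, 1, 1])
weights, n_correct = fit_weights(predictions, observed)
```

In this made-up example each predictor misses at least one residue, while the equal-weight vote recovers the observed string. The exhaustive search grows exponentially with the number of predictors, which is one reason a real implementation would hand the weight selection to an optimization solver instead.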
Related articles
ASTRO-FOLD: First Principles Tertiary Structure Prediction
ASTRO-FOLD is an integrated methodology for the first principles structure prediction of proteins based on an overall deterministic global optimization framework coupled with mixed-integer optimization. The novel four-stage approach combines the classical and new views of protein folding, while using free energy calculations and integer linear optimization to predict the location of helical seg...
MATHEMATICAL ENGINEERING TECHNICAL REPORTS Design of Compliant Mechanisms with Standardized Beam Elements via Mixed-Integer Programming
Design of compliant mechanisms requires both giving a structure flexibility that produces the kinematic performance and assuring stiffness to resist against structural failure. Presented in this paper is a mixed-integer programming approach to design optimization of a compliant mechanism realized as a frame structure consisting of standardized beam elements. In the optimization problem, the loc...
Novel consensus quantitative structure-retention relationship method in prediction of pesticides retention time in nano-LC
In this study, quantitative structure-retention relationship (QSRR) methodology was employed for modeling the retention times of 16 banned pesticides in a nano-liquid chromatography (nano-LC) column. The genetic algorithm-multiple linear regression (GA-MLR) method was employed for developing global and consensus QSRR models. The best global GA-MLR model was established by adjusting GA parameters. Three de...
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
The DNA sequence, which contains all genetic traits, is not itself a functional entity. Instead, it is converted to protein sequences by the transcription and translation processes. The protein sequence later takes on a 3D structure, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can imagine is undertaken by proteins with specific functio...
β-sheet Topology Prediction with High Precision and Recall for β and Mixed α/β Proteins
The prediction of the correct β-sheet topology for pure β and mixed α/β proteins is a critical intermediate step toward the three dimensional protein structure prediction. The predicted beta sheet topology provides distance constraints between sequentially separated residues, which reduces the three dimensional search space for a protein structure prediction algorithm. Here, we present a novel ...